Abstract: Cheng and Huang (2010) have recently proven that the bootstrap is asymptotically
consistent in estimating the distribution of the M-estimate of the Euclidean parameter. In this
note, we provide a first theoretical study of the bootstrap moment estimates in semiparametric
models. Specifically, we establish the bootstrap moment consistency of the Euclidean parameter,
which immediately implies the consistency of the t-type bootstrap confidence set. It is worth
pointing out that the only additional cost of achieving the bootstrap moment consistency beyond
the distribution consistency is to strengthen the $L_1$ maximal inequality condition required in
the latter to the $L_p$ maximal inequality condition for $p > 1$. The key technical tool in
deriving the above results is the general $L_p$ multiplier inequality developed in this note.
These general conclusions hold when the infinite dimensional nuisance parameter is root-n
consistent, and apply to a broad class of bootstrap methods with exchangeable bootstrap
weights. Our general theory is illustrated in the celebrated Cox regression model.

1. Introduction



In semiparametric models, an estimate of the asymptotic variance of the Euclidean parameter is
required to construct confidence sets and test statistics based on the asymptotic normality
result. For example, in bootstrap inference, the asymptotic variance estimate is needed to
build the t-type confidence set, which is known to have smaller coverage probability error than the
percentile/hybrid confidence sets; see [24]. In general, explicit variance estimation is not feasible
due to the presence of an infinite dimensional nuisance parameter; see [?], [?] for numerous
examples. In the literature, there are two existing estimation procedures, i.e., the profile sampler [?]
and the observed profile information [19]. The former (latter) method requires a careful choice of the
prior on the Euclidean parameter (of the step size in calculating the discretized information estimate).
Subsampling [21] is another possibility, but the optimal subsample size is difficult to choose in
practice. In contrast, the bootstrap can estimate the asymptotic variance without involving any tuning
parameter, and has thus become a widely used semiparametric inference procedure, e.g., [?], [?], [?].

Cheng and Huang (2010) have recently proven that the bootstrap is asymptotically consistent
in estimating the distribution of the M-estimate of the Euclidean parameter without requiring the
nuisance parameter to be root-n consistent. However, this distributional consistency does not imply the
consistency of the bootstrap variance estimators. Inspired by the recent development in moment
convergence of the parametric (bootstrap) M-estimate, i.e., [?], [20], we provide a first theoretically
rigorous study of the bootstrap moment estimates in semiparametric models. Specifically, we
establish the bootstrap moment consistency of the Euclidean parameter, which immediately implies
the consistency of the t-type bootstrap confidence set with the help of the conditional Slutsky's Lemma.
It is worth pointing out that the only additional cost of achieving the bootstrap moment consistency
beyond the distribution consistency is to strengthen the $L_1$ maximal inequality condition required
in the latter to the $L_p$ maximal inequality condition for $p > 1$. The key technical tool used in this
note is the general $L_p$ multiplier inequality, which is also of independent interest. These general
conclusions hold when the infinite dimensional nuisance parameter is root-n consistent, and apply
to a broad class of bootstrap methods with exchangeable bootstrap weights. However, relaxing the
root-n rate to a slower one appears to be quite difficult and is left for future studies. Finally,
we illustrate the practicality of the required conditions in the celebrated Cox regression model.



2. Preliminary 



2.1. Semiparametric M-Estimation 



Semiparametric M-estimation, which includes maximum likelihood estimation as a special case,
refers to a general method of estimation. Let $\theta \in \Theta$ be a Euclidean parameter of interest and
$\eta \in \mathcal{H}$ be an infinite dimensional nuisance parameter, e.g., some function. The semiparametric
M-estimator $(\hat\theta, \hat\eta)$ is obtained by optimizing some objective function $m(\theta, \eta)$ based on the
observations $(X_1, \ldots, X_n)$:

\[ (\hat\theta, \hat\eta) = \arg\sup_{\theta \in \Theta,\, \eta \in \mathcal{H}} \sum_{i=1}^{n} m(\theta, \eta)(X_i). \tag{1} \]

The form of the objective function depends on the context. For example, it could be the
log-likelihood, quasi-likelihood [?] or some pseudo-likelihood function, e.g., [?]. Define
$(\theta_0, \eta_0) = \arg\sup_{\theta \in \Theta,\, \eta \in \mathcal{H}} E_X m(\theta, \eta)(X)$. Under mild conditions, Cheng and Huang (2010) show that

\[ \sqrt{n}(\hat\theta - \theta_0) \overset{d}{\longrightarrow} N(0, \Sigma). \tag{2} \]

Note that $\hat\theta$ is semiparametric efficient and $\Sigma$ is the inverse of the efficient information matrix when
$m(\theta, \eta)$ is the log-likelihood function.



2.2. Exchangeably Weighted Bootstrap 

Define the bootstrap M-estimator $(\hat\theta^*, \hat\eta^*) = \arg\sup_{\theta \in \Theta,\, \eta \in \mathcal{H}} \sum_{i=1}^{n} m(\theta, \eta)(X_i^*)$, where $(X_1^*, \ldots, X_n^*)$
is the bootstrap sample. Note that Efron's nonparametric bootstrap consists of independent
draws with replacement from the original observations. In this case, we can re-express

\[ (\hat\theta^*, \hat\eta^*) = \arg\sup_{\theta \in \Theta,\, \eta \in \mathcal{H}} \sum_{i=1}^{n} W_{ni}\, m(\theta, \eta)(X_i), \]

where $(W_{n1}, \ldots, W_{nn}) \sim \mathrm{Mult}_n(n, (n^{-1}, \ldots, n^{-1}))$. This multinomial formulation can be naturally
generalized to a class of exchangeable bootstrap weights $\{W_{ni}\}_{i=1}^{n}$ whose distribution corresponds to
different bootstrap sampling schemes. This general bootstrap method, called the exchangeably weighted
bootstrap, was first proposed by Rubin (1981) and then extensively studied in [?], [?], [22]. The class
of exchangeably weighted bootstraps is practically useful. For example, in the Cox regression model,
the nonparametric bootstrap often gives many ties when it is applied to censored survival data due
to its "discreteness", while the general weighting scheme comes to the rescue. Other variations of the
nonparametric bootstrap are also studied in [?] using the term "generalized bootstrap".



The bootstrap weights $W_{ni}$ are assumed to satisfy the following conditions given in [22]:

W1. The vector $W_n = (W_{n1}, \ldots, W_{nn})'$ is exchangeable for all $n = 1, 2, \ldots$, i.e., for any permutation
$\pi = (\pi_1, \ldots, \pi_n)$ of $(1, 2, \ldots, n)$, the joint distribution of $\pi(W_n) = (W_{n\pi_1}, \ldots, W_{n\pi_n})'$ is the
same as that of $W_n$.

W2. $W_{ni} \ge 0$ for all $n, i$ and $\sum_{i=1}^{n} W_{ni} = n$ for all $n$.

W3. Let $\|W_{n1}\|_{2,1} = \int_0^\infty \sqrt{P_W(|W_{n1}| \ge u)}\, du$. Assume $\limsup_{n \to \infty} \|W_{n1}\|_{2,1} \le C$ for some positive
constant $C < \infty$.

W4. $\lim_{\lambda \to \infty} \limsup_{n \to \infty} \sup_{t \ge \lambda} t^2 P_W(W_{n1} > t) = 0$.

W5. $(1/n) \sum_{i=1}^{n} (W_{ni} - 1)^2 \overset{P_W}{\longrightarrow} c^2 > 0$.

Condition W3 is slightly stronger than a bounded second moment but is implied whenever the $(2 + \epsilon)$
absolute moment exists for some $\epsilon > 0$; see Appendix A.3. By Markov's inequality, Condition
W4 is satisfied if the $(2 + \epsilon')$ moment of $W_{n1}$ is finite for some $\epsilon' > 0$. The value of $c$ in W5 is
independent of $n$ and depends on the resampling method, e.g., $c = 1$ for the nonparametric bootstrap.
The bootstrap weights corresponding to the nonparametric bootstrap satisfy W1 - W5. Below,
we present several bootstrap examples satisfying W1 - W5, as shown in Praestgaard and Wellner
(1993), where more details can be found.

Example 1. i.i.d.-Weighted Bootstraps

In this example, the bootstrap weights are defined as $W_{ni} = \omega_i / \bar\omega_n$, where $\omega_1, \omega_2, \ldots, \omega_n$ are i.i.d.
positive r.v.s with $\|\omega_1\|_{2,1} < \infty$ and $\bar\omega_n$ is their sample mean. Thus, we can choose $\omega_i \sim$ Exponential(1) or $\omega_i \sim$ Gamma(4, 1).
The former corresponds to the Bayesian bootstrap. The multiplier bootstrap is often thought to
be a smooth alternative to the nonparametric bootstrap; see [15]. The value of $c^2$ is calculated as
$\mathrm{Var}(\omega_1)/(E\omega_1)^2$.

Example 2. The delete-h Jackknife

In the delete-h jackknife [?], the bootstrap weights are generated by permuting the deterministic
weights

\[ w_n = \left\{ \frac{n}{n-h}, \ldots, \frac{n}{n-h}, 0, \ldots, 0 \right\} \quad \text{with} \quad \sum_{i=1}^{n} w_{ni} = n. \]

Specifically, we have $W_{nj} = w_{nR_n(j)}$, where $R_n(\cdot)$ is a random permutation uniformly distributed
over $\{1, \ldots, n\}$. In Condition W5, $c^2 = h/(n - h)$. Thus, we need to choose $h/n \to \alpha \in (0, 1)$
such that $c > 0$. Therefore, the ordinary jackknife with $h = 1$ is inconsistent for estimating the
distribution.
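
The value $c^2 = h/(n-h)$ can be checked directly from the definition of the weights. The sketch below is only an illustration under our own naming (the helpers `delete_h_jackknife_weights` and `c_squared` are hypothetical and not part of the original note); it draws one permutation of the deterministic weights and evaluates the left hand side of Condition W5.

```python
import random


def delete_h_jackknife_weights(n, h, rng):
    """One draw of delete-h jackknife weights: a uniform random permutation
    of (n/(n-h), ..., n/(n-h), 0, ..., 0) with n-h nonzero entries."""
    base = [n / (n - h)] * (n - h) + [0.0] * h
    rng.shuffle(base)
    return base


def c_squared(weights):
    """Left hand side of Condition W5: (1/n) * sum_i (W_ni - 1)^2."""
    n = len(weights)
    return sum((w - 1.0) ** 2 for w in weights) / n


rng = random.Random(0)
n, h = 100, 40
w = delete_h_jackknife_weights(n, h, rng)

# W2: nonnegative weights summing to n (up to floating point rounding).
total = sum(w)
# W5: for these weights the quantity is deterministic and equals h/(n-h).
c2 = c_squared(w)
```

Because the weights are a permutation of fixed values, the quantity in W5 does not depend on the random permutation at all, which is why no Monte Carlo averaging is needed here.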

Example 3. The Double Bootstrap

In the double bootstrap, the bootstrap weights have the following distribution

\[ (W_{n1}, \ldots, W_{nn}) \sim \mathrm{Mult}_n\left(n, (\widetilde{W}_{n1}/n, \ldots, \widetilde{W}_{nn}/n)\right), \tag{3} \]

conditional on $\widetilde{W}_n$ following $\mathrm{Mult}_n(n, (n^{-1}, \ldots, n^{-1}))$. The value of $c$ is $\sqrt{2}$ in this example.
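
As a rough numerical illustration of the two-stage sampling in (3), the following sketch draws double bootstrap weights with the standard library only and evaluates the quantity in Condition W5, which should be close to $c^2 = 2$ for large $n$. The function names are our own and purely illustrative; the multinomial draw is implemented through `random.choices`, which is one of several possible choices.

```python
import random
from collections import Counter


def multinomial_counts(n, probs, rng):
    """Draw Mult_n(n, probs) as a list of counts, one per category."""
    draws = rng.choices(range(len(probs)), weights=probs, k=n)
    counts = Counter(draws)
    return [counts.get(i, 0) for i in range(len(probs))]


def double_bootstrap_weights(n, rng):
    """Two-stage draw: first-stage Efron weights, then a multinomial draw
    whose cell probabilities are the first-stage weights divided by n."""
    first = multinomial_counts(n, [1.0 / n] * n, rng)
    return multinomial_counts(n, [m / n for m in first], rng)


rng = random.Random(0)
n = 5000
w = double_bootstrap_weights(n, rng)

# Empirical version of W5; a single draw already concentrates near 2.
c2 = sum((wi - 1) ** 2 for wi in w) / n
```

The single-draw value fluctuates around 2 with a standard deviation of roughly 0.07 at this sample size, so it should be read as a sanity check rather than a precise estimate.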

Example 4. The Polya-Eggenberger Bootstrap

In this example, the bootstrap weights follow the multinomial distribution

\[ (W_{n1}, \ldots, W_{nn}) \sim \mathrm{Mult}_n\left(n, (D_{n1}, \ldots, D_{nn})\right), \tag{4} \]

conditional on $(D_{n1}, \ldots, D_{nn}) \sim \mathrm{Dirichlet}_n(\alpha, \ldots, \alpha)$ with $\alpha > 0$. The value of $c^2$ is calculated as
$(\alpha + 1)/\alpha$.

Example 5. The Multivariate Hypergeometric Bootstrap

As a particular urn-based bootstrap, the bootstrap weights follow the multivariate hypergeometric
distribution with density

\[ P(W_{n1} = w_1, \ldots, W_{nn} = w_n) = \frac{\prod_{i=1}^{n} \binom{K}{w_i}}{\binom{nK}{n}} \tag{5} \]

for some positive integer $K$. Condition W5 is satisfied with $c^2 = (K - 1)/K$.

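Under Conditions W1 - W5 and other regularity conditions, Cheng and Huang (2010) prove

\[ (\sqrt{n}/c)(\hat\theta^* - \hat\theta) \Longrightarrow N(0, \Sigma) \quad \text{conditional on } \mathcal{X}_n = (X_1, \ldots, X_n), \tag{6} \]

\[ \sup_{x \in \mathbb{R}^d} \left| P_{W|\mathcal{X}_n}\left((\sqrt{n}/c)(\hat\theta^* - \hat\theta) \le x\right) - P(N(0, \Sigma) \le x) \right| = o_{P_X}(1), \tag{7} \]

where "$\Longrightarrow$" represents the conditional weak convergence (in probability) defined in [8] and
$P_{W|\mathcal{X}_n}$ is the conditional probability given $\mathcal{X}_n$. In view of (6), the bootstrap variance estimate for
$\hat\theta$ is constructed as

\[ \hat\Sigma^* = (n/c^2)\, E_{W|\mathcal{X}_n} (\hat\theta^* - \hat\theta)^{\otimes 2}, \tag{8} \]

where $v^{\otimes 2} = vv'$ and $E_{W|\mathcal{X}_n}$ is the conditional expectation given the observed data $\mathcal{X}_n$. We say that the bootstrap
variance estimate is consistent if $\hat\Sigma^* \overset{P_X}{\longrightarrow} \Sigma$. In practice, $\hat\Sigma^*$ can be well approximated as follows:

\[ \hat\Sigma^* \approx \frac{n}{c^2} \cdot \frac{1}{B} \sum_{b=1}^{B} (\hat\theta^*(b) - \hat\theta)^{\otimes 2}, \]

where $\hat\theta^*(b)$ is computed based on the $b$-th bootstrap sample, for a sufficiently large number $B$ of
bootstrap repetitions.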
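
The Monte Carlo approximation of the bootstrap variance estimate can be sketched in a few lines. The example below is a deliberately simplified toy, not the semiparametric setting of this note: it assumes a scalar parameter with no nuisance component (the objective is $-(x-\theta)^2$, so the weighted M-estimator is a weighted mean) and uses the Bayesian bootstrap weights of Example 1, for which $c^2 = 1$. All function names are hypothetical.

```python
import random


def bayesian_bootstrap_weights(n, rng):
    """i.i.d.-weighted bootstrap with Exponential(1) multipliers,
    normalized so that the weights sum to n (Condition W2)."""
    omega = [rng.expovariate(1.0) for _ in range(n)]
    mean_omega = sum(omega) / n
    return [o / mean_omega for o in omega]


def weighted_mean(x, w):
    """Maximizer of sum_i w_i * (-(x_i - theta)^2) over theta."""
    return sum(wi * xi for wi, xi in zip(w, x)) / sum(w)


def bootstrap_variance(x, n_boot, rng, c2=1.0):
    """Monte Carlo version of the variance estimate:
    (n / c^2) * average over replicates of (theta*_b - theta_hat)^2."""
    n = len(x)
    theta_hat = sum(x) / n
    sq = 0.0
    for _ in range(n_boot):
        theta_star = weighted_mean(x, bayesian_bootstrap_weights(n, rng))
        sq += (theta_star - theta_hat) ** 2
    return (n / c2) * sq / n_boot


rng = random.Random(1)
x = [rng.gauss(0.0, 2.0) for _ in range(200)]
sigma2_star = bootstrap_variance(x, 2000, rng)

# For the sample mean the target is the variance of X; compare with
# the plain sample variance as a sanity check.
xbar = sum(x) / len(x)
sample_var = sum((xi - xbar) ** 2 for xi in x) / len(x)
```

In this toy case the bootstrap estimate and the sample variance agree up to Monte Carlo error; in a genuinely semiparametric model the inner step would be replaced by the weighted M-estimation of $(\theta, \eta)$.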



3. Main Result: Bootstrap Moment Consistency 



In this section, we establish the bootstrap moment consistency of $\hat\theta$, which directly implies the
consistency of $\hat\Sigma^*$ and of the t-type bootstrap confidence set. To guarantee the bootstrap $p$-th moment
consistency, the only additional cost is to strengthen the $L_1$ maximal inequality condition, which is
needed in showing the bootstrap distribution consistency, to the $L_{p'}$ maximal inequality condition
for $p' > 1$; see Condition M2. A simple sufficient condition for (11) in terms of bootstrap weights, i.e.,
(18), is also given. We verify it in the above five bootstraps besides the nonparametric bootstrap.

It is well known that convergence in distribution implies convergence in moment under a
uniform integrability condition. Lemma 2.1 of Kato (2011) further shows that the above argument
is also valid for conditional weak convergence in the case of the nonparametric bootstrap. In fact,
his arguments (after minor modifications) can also be applied to the above class of exchangeably
weighted bootstraps; see Lemma 1 below.

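Lemma 1. Let $T_n^*$ be a scalar statistic of $(X_1, \ldots, X_n)$ and $(W_{n1}, \ldots, W_{nn})$. Suppose that the bootstrap
weight $W_n$ satisfies W1 - W5 and the conditional distribution of $T_n^*$ given $\mathcal{X}_n$ converges weakly
to some fixed distribution $\mu$ in $P_X$-probability. If $E_{W|\mathcal{X}_n} |T_n^*|^{q'} = O_{P_X}(1)$ for some $q' > 1$, then

\[ E_{W|\mathcal{X}_n} (T_n^*)^{q} \overset{P_X}{\longrightarrow} \int t^{q}\, d\mu(t) \quad \text{for any integer } 1 \le q < q'. \]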

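Let "$\lesssim$" ("$\gtrsim$") denote smaller (greater) than, up to a universal constant. Denote $E_{XW}$
and $P_{XW}$ as the joint expectation and joint probability, respectively. Let $P_X f = \int f\, dP_X$, $\mathbb{P}_n f =
\sum_{i=1}^{n} f(X_i)/n$ and $\mathbb{P}_n^* f = \sum_{i=1}^{n} f(X_i^*)/n = \sum_{i=1}^{n} W_{ni} f(X_i)/n$. Define the (bootstrap) empirical
process and its norm as $\mathbb{G}_n f = \sqrt{n}(\mathbb{P}_n - P_X) f$ ($\mathbb{G}_n^* f = \sqrt{n}(\mathbb{P}_n^* - \mathbb{P}_n) f$) and $\|\mathbb{G}_n\|_{\mathcal{F}} = \sup_{f \in \mathcal{F}} |\mathbb{G}_n f|$
($\|\mathbb{G}_n^*\|_{\mathcal{F}} = \sup_{f \in \mathcal{F}} |\mathbb{G}_n^* f|$), respectively. For any class of functions $\mathcal{A}$ under the metric $d$, we
define $\log N_{[\,]}(\epsilon, \mathcal{A}, d)$ and $\log N(\epsilon, \mathcal{A}, d)$ as the $\epsilon$-bracketing entropy number and $\epsilon$-entropy number,
respectively. The related bracketing entropy integral and uniform entropy integral are thus

\[ J_{[\,]}(\delta, \mathcal{A}, d) = \int_0^{\delta} \sqrt{1 + \log N_{[\,]}(\epsilon, \mathcal{A}, d)}\, d\epsilon, \]

\[ J(\delta, \mathcal{A}) = \sup_{Q} \int_0^{\delta} \sqrt{1 + \log N(\epsilon \|A\|_{L_2(Q)}, \mathcal{A}, L_2(Q))}\, d\epsilon, \]

where $A$ is the envelope function of $\mathcal{A}$, and the supremum is taken over all discrete probability measures
$Q$ with $\|A\|_{L_2(Q)} > 0$.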

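In the following, we provide a set of sufficient conditions for bootstrap moment consistency.

M1. For any $(\theta, \eta) \in \Theta \times \mathcal{H}$, we have

\[ E_X\left(m(\theta, \eta) - m(\theta_0, \eta_0)\right) \lesssim -\|\theta - \theta_0\|^2 - d^2(\eta, \eta_0). \tag{9} \]

M2. Define $\mathcal{N}_\delta = \{m(\theta, \eta) - m(\theta_0, \eta_0) : \|\theta - \theta_0\| \le \delta,\ d(\eta, \eta_0) \le \delta,\ (\theta, \eta) \in \Theta \times \mathcal{H}\}$. We assume
that, for some $p' > 1$ and every $\delta > 0$,

\[ \left(E_X \|\mathbb{G}_n\|^{p'}_{\mathcal{N}_\delta}\right)^{1/p'} \lesssim \delta, \tag{10} \]

\[ \left(E_{XW} \|\mathbb{G}_n^*\|^{p'}_{\mathcal{N}_\delta}\right)^{1/p'} \lesssim \delta. \tag{11} \]

M3. Assume that $d(\hat\eta, \eta_0) = O_{P_X}(n^{-1/2})$ and $d(\hat\eta^*, \eta_0) = O_{P_{XW}}(n^{-1/2})$.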

Condition M1 assumes the quadratic behavior of the criterion function $(\theta, \eta) \mapsto E_X m(\theta, \eta)$.
Condition M2 assumes two maximal inequalities in terms of the $L_{p'}$-norm for $p' > 1$. Both conditions are
assumed in the global sense, which is needed to achieve the moment consistency. The
convergence rate of the bootstrap estimate in Condition M3, i.e., $d(\hat\eta^*, \eta_0) = O_{P_{XW}}(n^{-1/2})$, can
also be understood in the following way: for any $\delta > 0$, there exists a $0 < L < \infty$ such that

\[ P_X\left(P_{W|\mathcal{X}_n}\left(\sqrt{n}\, d(\hat\eta^*, \eta_0) \ge L\right) > \delta\right) \longrightarrow 0 \quad \text{as } n \to \infty. \]

We can verify Condition M3 using Theorem 2 of [6]. In the proof of our main Theorem 1, we find
that relaxing the above root-n rate to a slower one appears to be quite challenging, and we leave
this topic for future studies.

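Below, we discuss three different approaches for the verification of (10). Lemma 2.14.1 in
[26] implies that

\[ \left(E_X \|\mathbb{G}_n\|^{p'}_{\mathcal{N}_\delta}\right)^{1/p'} \lesssim J(1, \mathcal{N}_\delta)\, \|N_\delta\|_{L_{2 \vee p'}(P_X)}, \tag{12} \]

where $N_\delta$ is the envelope function of $\mathcal{N}_\delta$. Thus, Condition (10) holds if

\[ J(1, \mathcal{N}_\delta) < \infty, \tag{13} \]
\[ \|N_\delta\|_{L_{2 \vee p'}(P_X)} \lesssim \delta. \tag{14} \]

The typical function classes with finite uniform entropy integral include the VC class and the
related larger VC-hull class; see their definitions in Section 2.6 of [26]. Under the (global) Lipschitz
continuity condition

\[ |m(\theta, \eta)(x) - m(\theta_0, \eta_0)(x)| \le M(x)\left(\|\theta - \theta_0\| + d(\eta, \eta_0)\right) \quad \text{for any } (\theta, \eta) \in \Theta \times \mathcal{H}, \tag{15} \]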
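we can show (14) if $E_X M^{2 \vee p'}(X) < \infty$. Alternatively, by decomposing $(m(\theta, \eta) - m(\theta_0, \eta_0))$ as the
sum of $(m(\theta, \eta) - m(\theta_0, \eta))$ and $(m(\theta_0, \eta) - m(\theta_0, \eta_0))$, we can also verify (10) if the following holds:

\[ \left(E_X \|\mathbb{G}_n\|^{p'}_{\mathcal{N}_{\delta 1}}\right)^{1/p'} < \infty \quad \text{and} \quad \left(E_X \|\mathbb{G}_n\|^{p'}_{\mathcal{N}_{\delta 2}}\right)^{1/p'} \lesssim \delta, \]

where $\mathcal{N}_{\delta 1} = \{(\partial/\partial\theta) m(\theta, \eta) : \|\theta - \theta_0\| \le \delta,\ d(\eta, \eta_0) \le \delta\}$ and $\mathcal{N}_{\delta 2} = \{m(\theta_0, \eta) - m(\theta_0, \eta_0) : d(\eta, \eta_0) \le
\delta\}$. Again, Lemma 2.14.1 in [26] can be applied here. Our third approach is to bound the higher
moments $(E_X \|\mathbb{G}_n\|^{p'}_{\mathcal{N}_\delta})^{1/p'}$ for $p' > 1$ by $E_X \|\mathbb{G}_n\|_{\mathcal{N}_\delta}$ plus some norm of $N_\delta$, based on the following
two inequalities: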



IIP' 



^x||G„||^J < Ex\\Gn\Ws + n'^''^'^''' \\NsU^, for 1< p' < 2, (16) 



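\[ \left(E_X \|\mathbb{G}_n\|^{p'}_{\mathcal{N}_\delta}\right)^{1/p'} \lesssim E_X \|\mathbb{G}_n\|_{\mathcal{N}_\delta} + n^{-1/2 + 1/p'} \left(E_X |N_\delta|^{p'}\right)^{1/p'} \quad \text{for } p' \ge 2, \tag{17} \]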



where $\|\cdot\|_{\psi_p}$ is the Orlicz norm with $\psi_p(t) = \exp(t^p) - 1$. The above two inequalities are derived
based on Theorem 2.14.5 in [26] and the fact that the $\psi_p$-norm dominates the $L_p$-norm for each $p$.
Now, we assume (15). When $p' > 1$ but $p' \ne 2$, the second term in the right hand side of (16) ((17))
converges to zero as $n \to \infty$ if $\|M\|_{\psi_{p'}} < \infty$ ($\|M\|_{L_{p'}(P_X)} < \infty$). When $p' = 2$, the second term
in the right hand side of (17) is of the order $O(\delta)$ if $\|M\|_{L_2(P_X)} < \infty$. Thus, if $E_X \|\mathbb{G}_n\|_{\mathcal{N}_\delta} \lesssim \delta$,
we can show (10). Fortunately, several technical tools are available to compute the upper bound of
$E_X \|\mathbb{G}_n\|_{\mathcal{N}_\delta}$ in terms of the bracketing entropy integral (using Theorem 2.14.2 or Lemma 3.4.2 in
[26]) or the uniform entropy integral (see van der Vaart and Wellner (2011)). For example, in view
of the above analysis and Theorem 2.14.2 in [26], a simple sufficient condition for (10) is

\[ J_{[\,]}(1, \mathcal{N}_\delta, L_2(P_X)) + \|M\|_{\psi_{p' \vee 2}} < \infty \quad \text{together with Condition (15)}, \]

due to the fact that the $\psi_p$-norm dominates the $L_p$-norm and the $\psi_q$-norm for each $p$ and $q \le p$.

To verify (11), we will employ the general $L_p$ multiplier inequality developed in Appendix A.4 to
bound $(E_{XW} \|\mathbb{G}_n^*\|^{p'}_{\mathcal{N}_\delta})^{1/p'}$. According to Appendix A.5, it suffices to show the following bootstrap
weight condition

\[ W_{n1}^{p'} \text{ satisfies Conditions W3 \& W4} \tag{18} \]

if (10) holds. Condition (18) is essentially very weak; see the discussions in Examples 1-5 below. Finally,
we want to point out that Conditions W1 - W5 and M1 - M3 (when $p' = 1$) are also needed
in showing the bootstrap distribution consistency (6) & (7); see Theorems 1 & 3 of [6]. In view of
the above discussions, it appears that we only need to strengthen the $L_1$ maximal inequalities to
the $L_{p'}$ maximal inequalities for $p' > 1$ to achieve the bootstrap moment consistency beyond the
distribution consistency.

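Let $ET^p$ represent the $p$-th moment of any random vector $T$.

Theorem 1. Suppose that Conditions W1 - W5 and M1 - M3 hold. If $\hat\theta^*$ is distribution
consistent, i.e., (6), then we have

\[ E_{W|\mathcal{X}_n}\left((\sqrt{n}/c)(\hat\theta^* - \hat\theta)\right)^{p} \overset{P_X}{\longrightarrow} ET^p, \tag{19} \]

where $T \sim N(0, \Sigma)$, for any integer $p$ satisfying $1 \le p < p'$.
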
An obvious implication of Theorem 1 is that the bootstrap moment estimate of arbitrary order
is consistent if Condition M2 is valid for all $p' > 1$. It is worthwhile to remark that the uniform
integrability of $\hat\theta$, i.e., $\sup_n E_X \|\sqrt{n}(\hat\theta - \theta_0)\|^{p''} < \infty$ for $p < p'' < p'$, is also proven in the proof of Theorem 1. Thus, under
the same set of conditions, the moment convergence of $\hat\theta$ also follows. In addition, Theorem 1 is
also valid even for the approximate maximizer, i.e.,

\[ \mathbb{P}_n m(\hat\theta, \hat\eta) \ge \mathbb{P}_n m(\theta_0, \eta_0) - O_{P_X}(n^{-1}), \]

by slightly modifying its proof. 

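The distribution consistency result (7) directly implies the consistency of the bootstrap hybrid and
percentile confidence sets. Given the consistent variance estimate $\hat\Sigma$ based on $(X_1, \ldots, X_n)$, the
more accurate t-type bootstrap confidence set is constructed as

\[ BC_t(\alpha) = \left[ \hat\theta - \frac{\hat\Sigma^{1/2} \omega^*_{n(1 - \alpha/2)}}{\sqrt{n}},\ \hat\theta - \frac{\hat\Sigma^{1/2} \omega^*_{n(\alpha/2)}}{\sqrt{n}} \right], \]

where $\omega^*_{n\alpha}$ satisfies $P_{W|\mathcal{X}_n}\left((\sqrt{n}/c)(\hat\Sigma^*)^{-1/2}(\hat\theta^* - \hat\theta) \le \omega^*_{n\alpha}\right) = \alpha$. Note that $\omega^*_{n\alpha}$ is not unique when
$\theta$ is a vector. The following Corollary theoretically justifies the widely used bootstrap variance
estimate $\hat\Sigma^*$, and further establishes the consistency of the t-type confidence set $BC_t(\alpha)$.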

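Corollary 1. Suppose that the Conditions in Theorem 1 hold. If Condition M2 holds for some
$p' > 2$, then we have

\[ \hat\Sigma^* \overset{P_X}{\longrightarrow} \Sigma, \tag{20} \]
\[ P_{XW}\left(\theta_0 \in BC_t(\alpha)\right) \longrightarrow 1 - \alpha \tag{21} \]

as $n \to \infty$.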

The variance consistency (20) directly follows from Theorem 1. To show the consistency of the
t-type confidence set, i.e., (21), we apply Slutsky's Lemma and its conditional version given in
Appendix A.2 (together with Lemma 4.6 of [22]) to $\hat\theta$ and $\hat\theta^*$. Thus, for any fixed $x \in \mathbb{R}^d$, we
obtain that

\[ P_X\left(\sqrt{n}\, \hat\Sigma^{-1/2}(\hat\theta - \theta_0) \le x\right) \longrightarrow \Phi(x), \tag{22} \]

\[ P_{W|\mathcal{X}_n}\left((\sqrt{n}/c)(\hat\Sigma^*)^{-1/2}(\hat\theta^* - \hat\theta) \le x\right) \overset{P_X}{\longrightarrow} \Phi(x), \tag{23} \]

where $\Phi(x) = P(N(0, I) \le x)$. A straightforward application of Lemma 23.3 in [27] concludes the
proof of (21) based on (22) & (23).

At the end of this section, we verify the bootstrap weight condition (18) in six different types
of bootstraps, including the nonparametric bootstrap introduced in Section 2.2.

Example 1. i.i.d.-Weighted Bootstraps (Cont')

We will show that (18) holds under the assumption that $\omega_1$ has bounded $(2 + \epsilon)p'$-th moment for
some $\epsilon > 0$. This implies that

\[ \|\omega_1^{p'}\|_{2,1} < \infty \tag{24} \]

based on Appendix A.3. The derivations in Page 2080 of [22] give that



PwiW^'i >t)< P{uji > t'^P'il - e)) + t-P/(2p')^P/2p(g)"/2 (25) 
for any $0 < \epsilon < 1$, $p > 0$ and some $0 < \rho(\epsilon) < 1$, which further implies that

/oo j poo 
^P{oj{ > t{l - e)P')dt + nP/V(e)"^^ J t-^'^P' dt 

By choosing $p > 4p'$, we know that $\limsup_{n \to \infty} \|W_{n1}^{p'}\|_{2,1} < \infty$ due to (24). To see that $W_{n1}^{p'}$ satisfies
Condition W4, it suffices to show that $\lim_{t \to \infty} t^2 P(\omega_1^{p'} \ge t) = 0$ according to (25). This is implied
by Markov's inequality and the assumption that $\|\omega_1\|_{(2 + \epsilon)p'} < \infty$.

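Example 2. The delete-h Jackknife (Cont')

Recall that the bootstrap weight $W_{nj} = w_{nR_n(j)}$. Then, we have

\[ P_W(W_{n1}^{p'} \ge t) = \frac{1}{n} \#\{j : w_{nj}^{p'} \ge t\} = \frac{n - h}{n}\, 1\left\{t \le \frac{n^{p'}}{(n - h)^{p'}}\right\}. \tag{26} \]

In view of (26), Condition (18) can be verified as follows:

\[ \limsup_{n \to \infty} \int_0^\infty \sqrt{P_W(W_{n1}^{p'} \ge u)}\, du = \limsup_{n \to \infty} \left(\frac{n}{n - h}\right)^{p' - 1/2} = \left(\frac{1}{1 - \alpha}\right)^{p' - 1/2} < \infty, \tag{27} \]

\[ \lim_{t \to \infty} \limsup_{n \to \infty} t^2\, \frac{n - h}{n}\, 1\left\{t \le \frac{n^{p'}}{(n - h)^{p'}}\right\} = \lim_{t \to \infty} (1 - \alpha)\, t^2\, 1\{t \le (1 - \alpha)^{-p'}\} = 0. \tag{28} \]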

A sufficient condition for (18) is

\[ \limsup_{n \to \infty} E_W W_{n1}^{(2 + \epsilon)p'} < \infty \quad \text{for some } \epsilon > 0. \tag{29} \]

This can be proven based on Appendix A.3 and Chebyshev's inequality as remarked above.
Thus, to guarantee the bootstrap variance consistency, i.e., Corollary 1, we only need to require

\[ \limsup_{n \to \infty} E_W W_{n1}^{5} < \infty \tag{30} \]

since we can always choose $p' = 5/(2 + \epsilon) > 2$ for some small enough $\epsilon > 0$. Assuming $W_n =
(W_{n1}, \ldots, W_{nn})' \sim \mathrm{Mult}_n(n, (p_1, \ldots, p_n))$, we have

\[ E_W W_{n1}^{5} = n p_1 + 15 n^{(2)} p_1^2 + 25 n^{(3)} p_1^3 + 10 n^{(4)} p_1^4 + n^{(5)} p_1^5, \tag{31} \]

where $n^{(k)} = n(n - 1) \cdots (n - k + 1)$, according to Page 33 in [9]. If $p_i = 1/n$ for $i = 1, \ldots, n$, we know
$E_W W_{n1}^{5} \le 52$. Thus, Condition (30) (and also (18)) is trivially satisfied in Efron's nonparametric
bootstrap. Condition (30) can be easily verified in the remaining Examples 3-5 discussed before.
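
The fifth-moment formula (31) and the bound 52 can be confirmed with exact rational arithmetic. The sketch below is our own illustration (function names are hypothetical): it uses the fact that the marginal of a multinomial cell is Binomial$(n, p_1)$ and compares the direct fifth moment with the expansion in falling factorials, whose coefficients 1, 15, 25, 10, 1 are the Stirling numbers of the second kind $S(5, k)$.

```python
from fractions import Fraction
from math import comb

# Stirling numbers of the second kind S(5, k) for k = 1, ..., 5.
STIRLING_5 = [1, 15, 25, 10, 1]


def falling(n, k):
    """Falling factorial n^(k) = n (n-1) ... (n-k+1)."""
    out = 1
    for j in range(k):
        out *= n - j
    return out


def fifth_moment_formula(n, p):
    """Right hand side of (31) for a Binomial(n, p) marginal."""
    return sum(s * falling(n, k) * p ** k
               for k, s in enumerate(STIRLING_5, start=1))


def fifth_moment_direct(n, p):
    """E[W^5] computed directly from the Binomial(n, p) probabilities."""
    return sum(Fraction(k) ** 5 * comb(n, k) * p ** k * (1 - p) ** (n - k)
               for k in range(n + 1))


results = {}
for n in (5, 20, 50):
    p = Fraction(1, n)  # Efron's bootstrap: p_i = 1/n
    results[n] = (fifth_moment_formula(n, p), fifth_moment_direct(n, p))
```

With $p_1 = 1/n$, each term $n^{(k)} p_1^k$ is at most one, so the sum is bounded by $1 + 15 + 25 + 10 + 1 = 52$, in agreement with the text.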

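Example 3. The Double Bootstrap (Cont')

Based on (3) & (31), we can compute $E_W W_{n1}^5$ as

\[ E\left(E_W(W_{n1}^5 \mid \widetilde{W}_n)\right) = E\left( \widetilde{W}_{n1} + 15 \frac{n^{(2)}}{n^2} \widetilde{W}_{n1}^2 + 25 \frac{n^{(3)}}{n^3} \widetilde{W}_{n1}^3 + 10 \frac{n^{(4)}}{n^4} \widetilde{W}_{n1}^4 + \frac{n^{(5)}}{n^5} \widetilde{W}_{n1}^5 \right), \]

which implies Condition (30) since $E \widetilde{W}_{n1}^5 \le 52$.
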
Example 4. The Polya-Eggenberger Bootstrap (Cont')

Following a similar analysis to that of the double bootstrap and (4), we have

\[ E_W W_{n1}^5 = E\left( n D_{n1} + 15 n^{(2)} D_{n1}^2 + 25 n^{(3)} D_{n1}^3 + 10 n^{(4)} D_{n1}^4 + n^{(5)} D_{n1}^5 \right). \]

We can verify (30) if we can show

\[ \limsup_{n \to \infty} n^{(p)} E D_{n1}^{p} < \infty \]

for $p = 1, \ldots, 5$. This is essentially true for all $p$ based on the derivation below:

\[ n^{(p)} E D_{n1}^{p} = n^{(p)}\, \frac{\alpha(\alpha + 1) \cdots (\alpha + p - 1)}{n\alpha (n\alpha + 1) \cdots (n\alpha + p - 1)} \longrightarrow \prod_{k=1}^{p} \frac{\alpha + k - 1}{\alpha} \quad \text{as } n \to \infty, \]

where the formula for calculating $E D_{n1}^{p}$ is given in Page 96 of [?].
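
The limit above is easy to check numerically. The following sketch is illustrative only and assumes the standard fact that each coordinate of a symmetric Dirichlet$_n(\alpha, \ldots, \alpha)$ vector is Beta$(\alpha, (n-1)\alpha)$ distributed, so that $E D_{n1}^p = \prod_{k=0}^{p-1} (\alpha + k)/(n\alpha + k)$; the function names are our own.

```python
def falling(n, k):
    """Falling factorial n^(k) = n (n-1) ... (n-k+1)."""
    out = 1
    for j in range(k):
        out *= n - j
    return out


def scaled_dirichlet_moment(n, alpha, p):
    """n^(p) * E[D_n1^p] using the Beta(alpha, (n-1) alpha) marginal."""
    moment = 1.0
    for k in range(p):
        moment *= (alpha + k) / (n * alpha + k)
    return falling(n, p) * moment


def limit_value(alpha, p):
    """Claimed limit: prod_{k=1}^{p} (alpha + k - 1) / alpha."""
    out = 1.0
    for k in range(p):
        out *= (alpha + k) / alpha
    return out


# Relative gap between the finite-n value and the limit for large n.
gaps = {
    (alpha, p): abs(scaled_dirichlet_moment(10 ** 6, alpha, p)
                    / limit_value(alpha, p) - 1.0)
    for alpha in (0.5, 2.0)
    for p in range(1, 6)
}
```

For $n = 10^6$ the relative gap is of order $10^{-5}$ or smaller for the values of $\alpha$ and $p$ tried here, consistent with the stated convergence.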

Example 5. The Multivariate Hypergeometric Bootstrap (Cont')

According to (5) and Page 96 of [10], we have

\[ E_W W_{n1}^5 = a_{n,K}(1) + 15 a_{n,K}(2) + 25 a_{n,K}(3) + 10 a_{n,K}(4) + a_{n,K}(5), \]

where $a_{n,K}(r) = n^{(r)} K^{(r)} / (nK)^{(r)}$. Since $a_{n,K}(r) \le K^{(r)}$, we can show $\limsup_{n \to \infty} E_W W_{n1}^5 < \infty$.



4. Cox Regression Model with Right Censored Data

We use the following Cox regression model to illustrate the practicality of the stated Conditions
M1 - M3. Indeed, the advantages of using bootstrap inference in this model have been considered in
the literature, e.g., [?]. In the Cox regression model, the hazard function of the survival time $T$ of
a subject with covariate $Z$ is modelled as

\[ \lambda(t \mid z) = \lim_{\Delta \downarrow 0} \frac{1}{\Delta} P(t \le T < t + \Delta \mid T \ge t, Z = z) = \lambda(t) \exp(\theta' z), \tag{32} \]

where $\lambda$ is an unspecified baseline hazard function and $\theta$ is a regression vector. In this model, we
are usually interested in $\theta$ while treating the cumulative hazard function $\eta(y) = \int_0^y \lambda(t)\, dt$ as the
nuisance parameter. With right censoring of the survival time, the data observed is $X = (Y, \delta, Z)$, where
$Y = T \wedge C$, $C$ is a censoring time, $\delta = I\{T \le C\}$, and $Z$ is a regression covariate belonging to a
compact set $\mathcal{Z} \subset \mathbb{R}^d$. We assume that $C$ is independent of $T$ given $Z$. The log-likelihood is thus

\[ m(\theta, \eta)(x) = \delta \theta' z - \exp(\theta' z)\, \eta(y) + \delta \log \eta\{y\}, \tag{33} \]

where $\eta\{y\} = \eta(y) - \eta(y-)$ is a point mass that denotes the jump of $\eta$ at the point $y$. The parameter
space $\mathcal{H}$ is restricted to a set of nondecreasing cadlag functions on the interval $[0, \tau]$ with $\eta(\tau) \le
M$ for some constant $M$. It is well known that the MLE $\hat\theta$ is semiparametric efficient with the
asymptotic variance obtained in [?]:

\[ \Sigma = \tilde{I}_0^{-1} \equiv \left\{ E\, \tilde\ell_{\theta_0, \eta_0}(X)\, \tilde\ell_{\theta_0, \eta_0}'(X) \right\}^{-1}, \tag{34} \]

where the efficient information matrix $\tilde{I}_0$ is computed via the efficient score function

\[ \tilde\ell_{\theta_0, \eta_0}(x) = \delta \left( z - \frac{E e^{\theta_0' Z} Z 1\{Y \ge y\}}{E e^{\theta_0' Z} 1\{Y \ge y\}} \right) - e^{\theta_0' z} \int_0^{y} \left( z - \frac{E e^{\theta_0' Z} Z 1\{Y \ge t\}}{E e^{\theta_0' Z} 1\{Y \ge t\}} \right) d\eta_0(t). \]

The negative second derivative of the partial likelihood can be used to estimate $\tilde{I}_0$. This is a
special case of the observed profile information, defined as the negative second numerical derivative
of the profile likelihood; see [19]. In general, this approach requires a careful choice of the step size
and crucially depends on the curvature structure of the profile likelihood, which may not behave
well in small samples.

Cheng and Huang (2010) have shown that the exchangeably weighted bootstrap is consistent in
estimating the limiting distribution of $\hat\theta$. Below, we verify that Conditions M1 - M3 hold for
this model, such that the bootstrap is also consistent for estimating $\Sigma$. Since the true value $(\theta_0, \eta_0)$
is the maximizer of $(\theta, \eta) \mapsto P_X m(\theta, \eta)$ (under a certain identifiability condition), it is not difficult
to verify Condition M1 by defining $d(\eta, \eta_0) = \|\eta - \eta_0\|_\infty$, where $\|\cdot\|_\infty$ denotes the supremum norm.
The convergence rates of $\hat\eta$ ($\hat\eta^*$) are established in Theorem 3.1 of [19] (Theorem 2 of [6]), i.e.,

\[ \|\hat\eta - \eta_0\|_\infty = O_{P_X}(n^{-1/2}) \quad \text{and} \quad \|\hat\eta^* - \eta_0\|_\infty = O_{P_{XW}}(n^{-1/2}). \tag{35} \]

Thus, we have verified Condition M3. To verify (10) in M2, we apply the first approach by showing
(13) & (14). Note that the class of bounded monotone functions, e.g., $\eta(y)$ and $\eta(y-)$, is a VC-hull
class. Considering the form of $m(\theta, \eta)$ (writing $\eta\{y\} = \eta(y) - \eta(y-)$), we know that (13) is
satisfied by the stability property of the BUEI function class, i.e., Lemma 9.14 of [?]. Note that (14)
trivially holds since we can show (15) with $M(x)$ as some finite constant due to the compactness
of $\mathcal{Z}$ and $\mathcal{H}$. This also justifies $\|N_\delta\|_{L_{p'}(P_X)} < \infty$. Thus, (11) holds according to Appendix A.5.
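
To make the procedure concrete, the sketch below applies the i.i.d.-weighted (Bayesian) bootstrap to a simulated Cox model with a scalar covariate. It rests on several simplifying assumptions of our own that are not taken from this note: the nuisance parameter is profiled out so that the weighted estimate of $\theta$ maximizes a weighted log partial likelihood, there are no tied observation times, and a plain Newton iteration started at zero is adequate. All function names and simulation settings are hypothetical.

```python
import math
import random


def score_and_info(theta, data, w):
    """Score and observed information of the weighted log partial likelihood.
    `data` holds (y, delta, z) sorted by decreasing y; no ties are assumed."""
    s0 = s1 = s2 = 0.0  # weighted sums over the risk set {j: Y_j >= Y_i}
    score = info = 0.0
    for (y, delta, z), wi in zip(data, w):
        r = wi * math.exp(theta * z)
        s0 += r
        s1 += r * z
        s2 += r * z * z
        if delta and wi > 0:
            zbar = s1 / s0
            score += wi * (z - zbar)
            info += wi * (s2 / s0 - zbar * zbar)
    return score, info


def fit(data, w, iters=50, tol=1e-10):
    """Newton iterations for a scalar coefficient, started at zero."""
    theta = 0.0
    for _ in range(iters):
        score, info = score_and_info(theta, data, w)
        if abs(score) < tol or info <= 0:
            break
        theta += score / info
    return theta


def simulate(n, theta0, rng):
    """Exponential survival times with hazard exp(theta0 * z) and
    independent exponential censoring; returns rows sorted by decreasing y."""
    rows = []
    for _ in range(n):
        z = rng.uniform(-1.0, 1.0)
        t = rng.expovariate(math.exp(theta0 * z))
        c = rng.expovariate(0.5)
        rows.append((min(t, c), t <= c, z))
    return sorted(rows, key=lambda row: -row[0])


rng = random.Random(2)
n, n_boot = 200, 200
data = simulate(n, 1.0, rng)
theta_hat = fit(data, [1.0] * n)

sq = 0.0
for _ in range(n_boot):
    omega = [rng.expovariate(1.0) for _ in range(n)]
    mean_omega = sum(omega) / n
    w = [o / mean_omega for o in omega]  # weights sum to n, c^2 = 1
    sq += (fit(data, w) - theta_hat) ** 2
sigma2_star = n * sq / n_boot  # Monte Carlo bootstrap variance estimate
```

Because the exponential multipliers are continuous, the weighted fit never introduces ties, which is the practical advantage of general weighting schemes over the nonparametric bootstrap mentioned in Section 2.2.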

Acknowledgment. The author thanks Professor Yoichi Nishiyama for sending me his technical 
note attached to Nishiyama (2010) and thanks Professor Jon Wellner for helpful discussions. 

Appendix 

For simplicity, we denote $\|f\|_{Q,r}$ as the $L_r(Q)$-norm of the function $f$. Let $T_n^*$ be a random vector
composed of $(X_1, \ldots, X_n)$ and $(W_{n1}, \ldots, W_{nn})$. As in Section 2.2, we say that the conditional
distribution of $T_n^*$ given $\mathcal{X}_n$ converges weakly to some fixed distribution $T$ in $P_X$-probability, denoted
as "$T_n^* \Longrightarrow T$", if

\[ \sup_{f \in BL_1} \left| E_{W|\mathcal{X}_n} f(T_n^*) - E f(T) \right| \overset{P_X}{\longrightarrow} 0, \tag{A.1} \]

where $BL_1$ is the class of Lipschitz functions bounded by 1 and with Lipschitz norm 1.

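A.1. Proof of Theorem 1

Choose some $p''$ satisfying $p < p'' < p'$. According to Lemma 1, it suffices to show that

\[ \sup_n E_{XW} \|\sqrt{n}(\hat\theta^* - \theta_0)\|^{p''} < \infty \quad \text{and} \quad \sup_n E_X \|\sqrt{n}(\hat\theta - \theta_0)\|^{p''} < \infty. \]

The latter result is a special case of the former since we may take $W_{ni} = 1$ a.s. for $i = 1, \ldots, n$. To
show the former, it suffices to show

\[ \sup_n E_{XW} \left[\sqrt{n}\left(\|\hat\theta^* - \theta_0\| + d(\hat\eta^*, \eta_0)\right)\right]^{p''} < \infty. \tag{A.2} \]

To show (A.2), we partition the parameter space $\Theta \times \mathcal{H}$ into "shells" $S_{j,n}$, i.e.,

\[ S_{j,n} = \left\{ (\theta, \eta) \in \Theta \times \mathcal{H} : 2^{j-1} < \sqrt{n}\left(\|\theta - \theta_0\| + d(\eta, \eta_0)\right) \le 2^{j} \right\} \]

with $j$ ranging over integers, and then bound the probability of each shell under Conditions M1-M2.
For any fixed $j_0 > 0$, we have

\begin{align*}
& E_{XW} \left[\sqrt{n}\left(\|\hat\theta^* - \theta_0\| + d(\hat\eta^*, \eta_0)\right)\right]^{p''} \\
&\quad \le 2^{(j_0 - 1)p''} P_{XW}\left(\sqrt{n}\left(\|\hat\theta^* - \theta_0\| + d(\hat\eta^*, \eta_0)\right) \le 2^{j_0 - 1}\right) \\
&\qquad + \sum_{j = j_0}^{\infty} 2^{jp''} P_{XW}\left(2^{j-1} < \sqrt{n}\left(\|\hat\theta^* - \theta_0\| + d(\hat\eta^*, \eta_0)\right) \le 2^{j}\right) \\
&\quad \le 2^{(j_0 - 1)p''} + \sum_{j = j_0}^{\infty} 2^{jp''} P_{XW}\left( \sup_{(\theta, \eta) \in S_{j,n}} \mathbb{P}_n^*\left(m(\theta, \eta) - m(\theta_0, \eta_0)\right) \ge 0 \right) \\
&\quad \le 2^{(j_0 - 1)p''} + \sum_{j = j_0}^{\infty} 2^{jp''} P_{XW}\left( \sup_{(\theta, \eta) \in S_{j,n}} (\mathbb{P}_n^* - P_X)\left(m(\theta, \eta) - m(\theta_0, \eta_0)\right) \gtrsim \frac{2^{2j - 2}}{n} \right),
\end{align*}

where the last inequality follows from Condition M1. By the decomposition $(\mathbb{P}_n^* - P_X) f =
n^{-1/2}(\mathbb{G}_n^* + \mathbb{G}_n) f$, we can further bound the second term in the above by

\begin{align*}
& \sum_{j = j_0}^{\infty} 2^{jp''} P_{XW}\left( \sup_{(\theta, \eta) \in S_{j,n}} \mathbb{G}_n^*\left(m(\theta, \eta) - m(\theta_0, \eta_0)\right) \gtrsim \frac{2^{2j - 2}}{\sqrt{n}} \right) \\
&\qquad + \sum_{j = j_0}^{\infty} 2^{jp''} P_{X}\left( \sup_{(\theta, \eta) \in S_{j,n}} \mathbb{G}_n\left(m(\theta, \eta) - m(\theta_0, \eta_0)\right) \gtrsim \frac{2^{2j - 2}}{\sqrt{n}} \right) \\
&\quad \lesssim \sum_{j = j_0}^{\infty} 2^{jp''} \left(\frac{2^{j}}{\sqrt{n}}\right)^{p'} \left(\frac{\sqrt{n}}{2^{2j - 2}}\right)^{p'} \lesssim \sum_{j = j_0}^{\infty} 2^{j(p'' - p')}.
\end{align*}

The first inequality follows from Markov's inequality and Condition M2 (applied with $\delta = 2^{j}/\sqrt{n}$). Now, we can conclude that

\[ E_{XW} \left[\sqrt{n}\left(\|\hat\theta^* - \theta_0\| + d(\hat\eta^*, \eta_0)\right)\right]^{p''} \lesssim 2^{(j_0 - 1)p''} + \sum_{j = j_0}^{\infty} 2^{j(p'' - p')} < \infty \]

since we assume that $p'' < p'$. This concludes the proof. □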



A.2. Conditional Slutsky's Lemma

Let $T_n^*$ and $C_n$ be random vectors composed of $(X_1, \ldots, X_n)$ and $(W_{n1}, \ldots, W_{nn})$, and of $(X_1, \ldots, X_n)$,
respectively. If $T_n^* \Longrightarrow T$ and $C_n \overset{P_X}{\longrightarrow} C$ for some vector $C$, then we have

(i) $T_n^* + C_n \Longrightarrow T + C$;

(ii) $C_n T_n^* \Longrightarrow CT$;

(iii) $C_n^{-1} T_n^* \Longrightarrow C^{-1} T$ provided $C \ne 0$,

where the vector $C$ in (i) must be of the same dimension as $T$, and $C$ in (ii) & (iii) can be a matrix.

Proof: Without loss of generality, we assume $C$ to be a vector. If $C$ is a matrix, the conclusions in
(ii) and (iii) are still valid since matrix multiplication and matrix inversion are both continuous
operations. We first show the conditional weak convergence $(T_n^*, C_n) \Longrightarrow (T, C)$, and then apply
the conditional version of the continuous mapping Theorem, i.e., Theorem 10.8 in [13], to conclude
the proof. We first show the following result:

\[ \text{if } U_n^* \Longrightarrow U \text{ and } \|U_n^* - V_n^*\| \overset{P_{XW}}{\longrightarrow} 0, \text{ then } V_n^* \Longrightarrow U, \tag{A.3} \]

where $U_n^*$ and $V_n^*$ are random vectors composed of $(X_1, \ldots, X_n)$ and $(W_{n1}, \ldots, W_{nn})$. For any
$f \in BL_1$, we have

\[ \left| E_{W|\mathcal{X}_n} f(U_n^*) - E_{W|\mathcal{X}_n} f(V_n^*) \right| \le \epsilon\, E_{W|\mathcal{X}_n} 1\{\|U_n^* - V_n^*\| \le \epsilon\} + 2 E_{W|\mathcal{X}_n} 1\{\|U_n^* - V_n^*\| > \epsilon\} \tag{A.4} \]

for every $\epsilon > 0$. The first term in the right hand side of (A.4) can be made arbitrarily small by
choice of $\epsilon$, while the second term converges to zero in $P_X$-probability as $n \to \infty$. Thus, we claim

\[ \sup_{f \in BL_1} \left| E_{W|\mathcal{X}_n} f(U_n^*) - E_{W|\mathcal{X}_n} f(V_n^*) \right| \overset{P_X}{\longrightarrow} 0. \]

Considering the definition (A.1) and $U_n^* \Longrightarrow U$, we complete the proof of (A.3). According to (A.3),
it suffices to show $(T_n^*, C) \Longrightarrow (T, C)$ since $\|(T_n^*, C_n) - (T_n^*, C)\| = \|C_n - C\| \overset{P_X}{\longrightarrow} 0$. It is easy to
show that for every bounded Lipschitz function $(x, y) \mapsto f(x, y)$, the function $x \mapsto f(x, c)$ is also
bounded and Lipschitz continuous. Thus, if $T_n^* \Longrightarrow T$, then we have

\[ \sup_{f \in BL_1} \left| E_{W|\mathcal{X}_n} f(T_n^*, C) - E f(T, C) \right| \le \sup_{f \in BL_1} \left| E_{W|\mathcal{X}_n} f(T_n^*) - E f(T) \right| \overset{P_X}{\longrightarrow} 0. \]

Again, an application of (A.1) completes the whole proof. □



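A.3. An Inequality for the $\|\cdot\|_{2,1}$-norm

For any $Y \ge 0$ and $r > 2$, we have

\[ \frac{1}{2} \|Y\|_2 \le \|Y\|_{2,1} \le \frac{r}{r - 2} \|Y\|_r, \tag{A.5} \]

where $\|Y\|_r = (EY^r)^{1/r}$.

Proof: The first inequality is established as follows:

\[ \|Y\|_2^2 = 2 \int_0^\infty t P(|Y| > t)\, dt = 2 \int_0^\infty t \sqrt{P(|Y| > t)} \sqrt{P(|Y| > t)}\, dt \le 2 \|Y\|_{2,1} \|Y\|_2 \]

by Markov's inequality, which gives $t \sqrt{P(|Y| > t)} \le \|Y\|_2$. For the second inequality, we have

\begin{align*}
\|Y\|_{2,1} &= \int_0^{a} \sqrt{P(|Y| > t)}\, dt + \int_a^\infty \sqrt{P(|Y| > t)}\, dt \\
&\le a + \|Y\|_r^{r/2} \int_a^\infty t^{-r/2}\, dt \\
&= a + \|Y\|_r^{r/2}\, \frac{2 a^{1 - r/2}}{r - 2} \equiv U(a)
\end{align*}

for any $a > 0$. It is easy to show that the minimum of $U(a)$ is just $[r/(r - 2)] \|Y\|_r$, attained when $a = \|Y\|_r$.
This completes the proof of the second inequality in (A.5). □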
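
As a sanity check on (A.5), the three norms can be evaluated for a concrete distribution. The sketch below is our own illustration and assumes $Y \sim$ Exponential(1), for which $P(Y > t) = e^{-t}$, $\|Y\|_2 = \sqrt{2}$ and $\|Y\|_r = \Gamma(r + 1)^{1/r}$; the $\|\cdot\|_{2,1}$-norm is computed by a simple trapezoidal rule rather than in closed form.

```python
import math


def norm_21_exponential(upper=60.0, step=1e-3):
    """Trapezoidal approximation of int_0^inf sqrt(P(Y > t)) dt
    for Y ~ Exponential(1), where sqrt(P(Y > t)) = exp(-t / 2)."""
    n_steps = int(upper / step)
    total = 0.5 * (1.0 + math.exp(-upper / 2.0))
    for k in range(1, n_steps):
        total += math.exp(-k * step / 2.0)
    return total * step


def lr_norm_exponential(r):
    """||Y||_r = (E Y^r)^(1/r) = Gamma(r + 1)^(1/r) for Y ~ Exponential(1)."""
    return math.gamma(r + 1.0) ** (1.0 / r)


def upper_bound_U(a, r, norm_r):
    """U(a) = a + ||Y||_r^(r/2) * 2 a^(1 - r/2) / (r - 2) from the proof."""
    return a + norm_r ** (r / 2.0) * 2.0 * a ** (1.0 - r / 2.0) / (r - 2.0)


norm_21 = norm_21_exponential()          # exact value is 2
lower = 0.5 * lr_norm_exponential(2.0)   # (1/2) ||Y||_2
checks = []
for r in (2.5, 3.0, 4.0, 8.0):
    norm_r = lr_norm_exponential(r)
    upper = r / (r - 2.0) * norm_r
    # U is minimized at a = ||Y||_r, where it equals the bound in (A.5).
    u_min = upper_bound_U(norm_r, r, norm_r)
    u_grid = min(upper_bound_U(0.1 * k, r, norm_r) for k in range(1, 200))
    checks.append((r, lower <= norm_21 <= upper, u_min, upper, u_grid))
```

For this distribution the middle term equals 2, the lower bound is about 0.71, and the upper bound ranges from roughly 4.4 to 8.1 over the values of $r$ tried, so both inequalities hold with room to spare.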

A.4. The $L_p$ Multiplier Inequality

Let $W_n = (W_{n1}, \ldots, W_{nn})'$ be non-negative exchangeable random variables with joint law $P_W$ such
that, for every $n$, $R_n = \int_0^\infty \sqrt{P_W(W_{n1} \ge u)}\, du < \infty$. Let $Z_{ni}$, $i = 1, 2, \ldots, n$, be i.i.d. random
elements with law $P_X$ and values in $\ell^\infty(\mathcal{F}_n)$, and write $\|Z_{ni}\|_n = \sup_{f \in \mathcal{F}_n} |Z_{ni}(f)|$. It is assumed
that the $Z_{ni}$'s are independent of $W_n$. Then for any $n_0$ such that $1 \le n_0 < \infty$ and any $n > n_0$, the
following inequality holds for any $p \ge 1$:



1 

1=1 



< ^0 ||||^nlLllp^,p 



\inaXi<i<nWni\\p^^p 



Pxw,P 



n 



max 

no<i<n 



1 

J=no+l 



(A.6) 



Px,P 



Proof: This Lemma generalizes the results in Lemma 4.1 of [28], where $p = 1$. By the triangle
inequality, we have



1 

1=1 





< 




n 


Pxw,P 





^ no 
-^y^WniZn 





+ 




n 


Pxw,P 





1 

V WniZn 



i=no+l 



Pxw,P 



The first term in the above is trivially bounded by 

maxi<j<j2 



no \\\\Znl\ 



n\\Px,p 



n 



Denote $W_{n(i)}$ as the $i$-th ordered value of the $W_{ni}$, i.e., $W_{n(1)} \ge W_{n(2)} \ge \cdots \ge W_{n(n)}$. Note that
$\| \|\sum_{i=1}^{n} W_{ni} Z_{ni}\|_n \|_{P_{XW},p} = \| \|\sum_{i=1}^{n} W_{n(i)} Z_{ni}\|_n \|_{P_{XW},p}$ since $W_n$ is assumed to be exchangeable
and $P_X^\infty$ is permutation invariant. We write the second term as the following telescoping sum,
^ n 1 " 

^nii)Zni = -l^ Yl ^(^n(i) - ^n(m))^i> 
i=no+l i=no+l 

where Tj = l^j=ri,(,+i -^nj and Wn(^n+i) = 0- Thus, we obtain that 



Wn{i)Zn 

i=no+l 



Pxw,P 



< 



< 



/ II « ll™ 

i=no+l 



max ||Tj||„ 

no<i<n 



Px,P 



j=no+l 



Pw,P 



Recalling the definition of Tj , it remains to show 

Ew( Y ^iiWnii)-Wni^+l))] < n^/'i?™. 



(A.7) 



\i=no+l 



By some algebra, the left hand side of ()A.7p can be re-written as 



Ewi Y =Ew{ \n{r > 1 : M^„(r) > u}du , 

V*=„o+l-^^n{«+i) J \Jo ^ J 
which is bounded by

Ew {K^ > 1 : W^{r) > n}Y'^ du < nP'^Ew / {tJ{r > 1 : > u}/nY'^ du 

Jo Jo 

POO 

< nP/^Ew / {tt{r > 1 : W^u. > n}/n}'/' du 

Jo 

roo 

< / ^yPwiWni > u)du = nP/'^Rn 
Jo 

based on Jensen's inequality. This completes the whole proof. □

A.5. Verification of Condition (11)

Suppose that the $L_{p'}$ maximal inequality (10) and the bootstrap weight condition (18) hold. If $\|N_\delta\|_{P_X,p'} <
\infty$, then we have Condition (11) for each $p' \ge 1$.

Proof: We first apply the symmetrization argument to show



'n\\Ms\\pxw,p' ~ ^ 



1 " 



(A. 



Pxw,p' 



Note that 



-, n -\ 

by Condition W2. Let $W_n' = (W_{n1}', \ldots, W_{nn}')$ be exchangeable bootstrap weights generated from
$P_{W'}$, an independent copy of $P_W$. The bootstrap weight conditions W1 and W2 imply that
$E_{W'} W_{ni}' = 1$ for $i = 1, \ldots, n$. Then, we have



Exw 



Exw 



< ExwEw' 



l=Y.{Wm-mx.-Px) 

1 " 

Y.^Wm - Ew'WU){5x. - Px) 
1 " 

-j=Y.{Wni-WU){5x,-Px) 



i=l 



P' 



based on Jensen's inequality and the reverse Fatou's Lemma. Finally, a typical application
of the symmetrization argument and Minkowski's inequality concludes (A.8).

To further bound the right hand side of (A.8), we next apply the $L_p$ multiplier inequality (A.6)
with $Z_{ni} = (\delta_{X_i} - P_X)$ and $\mathcal{F}_n = \mathcal{N}_\delta$. This gives, due to Condition W3,



IWsWp 



XW,P 



< 





1 


max Wni 










+ 


max 


Px,P' Vn 


l<i<n 


Pw,P' 


no<i<n 



1 * 

J=no+l 



Px,p' 
max

no<i<n 



1 * 



Ms 



+ 



PX:P' 



^ no 
^ ^ 



Px,p' 



< 



< 2 



max. \\Gk\\j^ 

no<k<n 



+ 


\\^no\\j\fg 


Px,p' 





Px,p' 



Px,p' 



by the triangular inequality. In addition, we can bound ||||-Z'ni||A/'5l|Px,p' 

||||^nl||A/'J|Px,p' = llll'^Xi - Px\Ws\\Px,p' ^ IIII'^XiIIa/'JIpx,p' + II ll-P^ lUJIp^.p' < 2 

due to the reverse Fatou's Lemma. Thus, we obtain that 



Px,p' 



n\\Afs\\Pxw,p' 



Px,p' • 



max Wni 

l<i<n 



+ 



Pw,p' 



max ||(Ufc||^ 

no<k<n ° 



Px:P' 



^ I + 11. 



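Considering Condition (18) and Lemma 4.7 of [22], we have $n^{-1/2} E_W(\max_{1 \le i \le n} W_{ni}^{p'}) \to 0$. The
inequality $\|\max_{1 \le i \le n} W_{ni}\|_{P_W,p'} \le E_W(\max_{1 \le i \le n} W_{ni}^{p'})$ (due to $\max_{1 \le i \le n} W_{ni} \ge 1$) implies

\[ n^{-1/2} \left\| \max_{1 \le i \le n} W_{ni} \right\|_{P_W,p'} = o(1). \tag{A.9} \]
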



Since $\|N_\delta\|_{P_X,p'}$ is assumed to be finite, the above term $I$ converges to zero, and thus is smaller
than an arbitrary $\delta > 0$ for sufficiently large $n$. For any positive r.v. $Y$, it is easy to prove that

\[ EY^q = \int_0^\infty q t^{q-1} P(Y > t)\, dt \quad \text{for any } q > 0. \]

Levy's inequality, i.e., Proposition A.1.2 in [26], implies that

\[ P\left(\max_{k \le n} \|\mathbb{G}_k\|_{\mathcal{N}_\delta} > \lambda\right) \le 2 P\left(\|\mathbb{G}_n\|_{\mathcal{N}_\delta} > \lambda\right) \quad \text{for every } \lambda > 0. \]

Thus, we have that $II \le 2^{1/p'} \| \|\mathbb{G}_n\|_{\mathcal{N}_\delta} \|_{P_X,p'}$. This concludes the whole proof. □